Skip to content

Fix tascCODA always returning insignificant results#1016

Merged
Zethson merged 2 commits into
mainfrom
fix-tasccoda-nan-medians
Jun 10, 2026
Merged

Fix tascCODA always returning insignificant results#1016
Zethson merged 2 commits into
mainfrom
fix-tasccoda-nan-medians

Conversation

@Zethson

@Zethson Zethson commented Jun 10, 2026

Copy link
Copy Markdown
Member

What

Fixes tascCODA always returning insignificant results, plus a run_hmc crash.

1. NaN node-effect medians (the main bug)

The arviz 1.0 migration (#911) recomputed the posterior median in summary_prepare via:

summ["median"] = posterior[var_names].median(dim=["chain", "draw"]).to_dataframe().stack().reindex(summ.index)

.to_dataframe().stack() produces a tuple MultiIndex (e.g. ("A", "x", "n0", "alpha")) that never matches arviz's string row labels (e.g. "b_tilde[x, n0]"), so reindex(summ.index) fills the entire median column with NaN.

tascCODA decides node credibility with |median| > delta (__complete_node_df), so NaN medians made every effect non-credible regardless of the data — i.e. "always insignificant".
scCODA is unaffected because it selects effects via inclusion probabilities, not the median, which is why only tascCODA broke.

The fix computes the medians positionally, which matches az.summary's row order exactly (rows ordered by var_names, then C-order of each variable's dims — identical to DataArray.values.flatten()).

2. run_hmc rng-key TypeError

run_hmc converted rng_key to a JAX key before handing it to set_init_mcmc_states, which seeds a NumPy generator and therefore needs an int — raising TypeError: SeedSequence expects int ....
It now keeps the int seed for the NumPy generator and derives the JAX key separately, mirroring run_nuts.

Not included here

While verifying this, I found a deeper, separate problem: even with medians fixed, the (correct) spike-and-slab model recovers no credible effects because the global theta collapses to its prior under numpyro's samplers. That needs a model-level change and is tracked in #1015 with the full analysis.

Tests

No new test is added: the meaningful assertion (tascCODA recovers known credible effects) is blocked by #1015, and asserting only that medians are non-NaN isn't worth pinning.
Existing tests/tools/_coda/test_tasccoda.py passes, and run_hmc now runs end-to-end.

The arviz 1.0 migration (#911) recomputed the posterior median in
`summary_prepare` via
`posterior[var_names].median(...).to_dataframe().stack().reindex(summ.index)`.
That `stack()` yields a tuple MultiIndex (e.g. `("A", "x", "n0", "alpha")`) that never matches arviz's string row labels (e.g. `"b_tilde[x, n0]"`), so every median silently became NaN.
tascCODA decides credibility with `|median| > delta`, so NaN medians made every node effect non-credible regardless of the data.
scCODA is unaffected because it selects effects via inclusion probabilities rather than the median.
Compute the medians positionally instead, which matches `az.summary`'s row order (by `var_names`, then C-order of each variable's dims) exactly.

Also fix `run_hmc`, which passed a JAX PRNG key into `set_init_mcmc_states` where `np.random.default_rng` expects an int seed, raising `TypeError`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Lukas Heumos <lukas.heumos@posteo.net>
@Zethson Zethson force-pushed the fix-tasccoda-nan-medians branch from 2602efb to 6fa34b6 Compare June 10, 2026 10:05
@github-actions github-actions Bot added the bug Something isn't working label Jun 10, 2026
@Zethson Zethson force-pushed the fix-tasccoda-nan-medians branch from 060551b to 30a4912 Compare June 10, 2026 11:12
Two independent breakages in the v6 upload step:

1. Auth: `use_oidc: true` needs an `id-token`, which GitHub won't issue for
   fork PRs, while tokenless upload is rejected for the repo's own branches
   ("Token required because branch is protected"). The only config covering
   both is token auth with a tokenless fork fallback: pass CODECOV_TOKEN, which
   authenticates same-repo runs, while fork PRs get an empty secret and fall
   back to tokenless (the case tokenless is actually allowed). Drop `use_oidc`.

2. GPG: intermittent "Can't check signature: No public key" came from the
   uploader fetching its signing key from Codecov's deleted keybase account.
   v7 migrates the key to `codecovsecops`, fixing verification at the source.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Zethson Zethson force-pushed the fix-tasccoda-nan-medians branch from 30a4912 to 70c4689 Compare June 10, 2026 11:54
@codecov-commenter

codecov-commenter commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.81%. Comparing base (12897e1) to head (70c4689).
⚠️ Report is 91 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1016      +/-   ##
==========================================
+ Coverage   73.54%   77.81%   +4.26%     
==========================================
  Files          48       50       +2     
  Lines        5613     6580     +967     
==========================================
+ Hits         4128     5120     +992     
+ Misses       1485     1460      -25     
Files with missing lines Coverage Δ
pertpy/tools/_coda/_base_coda.py 59.71% <100.00%> (+2.54%) ⬆️

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Zethson Zethson merged commit 8e99ae4 into main Jun 10, 2026
18 of 19 checks passed
@Zethson Zethson deleted the fix-tasccoda-nan-medians branch June 10, 2026 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants